keywords:"document classification" - Search Results - Digital Repository


	Using of Data Mining Method for Analysis of Social Networks Novosad, Andrej ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor) The thesis discusses data mining of social media. It introduces data mining and the available mining methods. It also explores social media and social networks, what they can offer, and what problems they bring. The APIs of three social networking sites are examined for the data mining opportunities they provide. Techniques of text mining and document classification are explored. The thesis describes the implementation of a web application that mines data from Twitter using the SVM algorithm. The application classifies tweets based on their text, where the classes represent the tweets' continents of origin. Several experiments, executed both in RapidMiner and in the implemented web application, are then proposed and their results examined.
	Analysis of Social Media Content Discussing Czech Mobile Operators Pavlů, Jan ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor) The main topic of this thesis is sentiment analysis of posts obtained from social networks. The posts concern Czech mobile network operators. Data visualization is also an essential part of the implemented system. The sentiment analysis uses machine learning techniques: downloaded posts are cleaned, lemmatized, and transformed into feature vectors, and the Stochastic Gradient Descent algorithm is used for classification. The analyzed data are visualized in charts and as a list of posts. The system also provides tools for text categorization. The accuracy, precision, recall, and F1 score of the sentiment analysis are each about 75%. The accuracy of post categorization is high (about 80%), but its precision, recall, and F1 score are low (about 30%), which is why post categorization is not done automatically. The benefit of the system is that it automatically collects data from different sources, analyzes them, and displays them. It also provides tools for manually changing sentiment and categories, so users can help improve the system's performance.
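The gap between an accuracy of about 80% and an F1 score of about 30% reported above is typical when a category applies to only a small share of the posts. The following is a minimal sketch of that effect, not taken from the thesis: the function name `binary_metrics` and the confusion-matrix counts are hypothetical, chosen only so that the resulting figures resemble the reported ones.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for one rare category: 70 posts, only 10 of which belong to it.
metrics = binary_metrics(tp=3, fp=7, fn=7, tn=53)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, the 53 true negatives dominate the accuracy, while precision, recall, and F1 depend only on the few positive cases and stay low.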
	Sentiment Analysis with Use of Data Mining Sychra, Martin ; Burget, Radek (referee) ; Bartík, Vladimír (advisor) The subject of this work is sentiment analysis, mainly from the perspective of informatics and only marginally from a linguistic one. The linguistic part discusses the term sentiment and the language methods used to analyze it, e.g. lemmatization, POS tagging, and stopword lists. More attention is paid to the structure of the sentiment analyzer, which is based on machine learning methods (support vector machines, Naive Bayes, and maximum entropy classification). Building on this theoretical background, a functional analyzer is designed and implemented. The experiments focus mainly on comparing the classification methods and on the benefits of the individual preprocessing methods. The success rate of the constructed classifier reaches up to 84% in cross-validation.
	Artificial Intelligence Document Classification Molnár, Ondřej ; Kačic, Matej (referee) ; Třeštíková, Lenka (advisor) This paper deals with document classification using artificial intelligence. It describes the principles of classification and machine learning. It also introduces AI methods and presents the Naive Bayes classification method in detail. It provides a practical implementation of the classifier in MS Office and discusses possible extensions.
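As an illustration of the Naive Bayes method named in the abstract above, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the thesis's implementation: the function names, the whitespace tokenization, and the tiny training set are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior for the given text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in class_counts.items():
        # Log prior of the class.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training documents with made-up class labels.
training = [
    ("invoice payment due amount", "finance"),
    ("payment received invoice number", "finance"),
    ("meeting agenda project schedule", "planning"),
    ("project schedule meeting notes", "planning"),
]
model = train_naive_bayes(training)
print(classify("invoice amount overdue", model))
```

Summing log probabilities instead of multiplying raw probabilities keeps the scores numerically stable for longer documents.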
	Content Document Classification Borčík, Filip ; Kačic, Matej (referee) ; Třeštíková, Lenka (advisor) This work deals with document classification based on the ISO/IEC 27000 family of standards. It points out the need for classification in a corporate environment, as well as the issues it raises. The work also implements a system that classifies MS Office documents by analyzing their content with defined rules. This system is integrated into the DocTag application developed by the company AEC.
	Deep Neural Networks for Historical Document Classification Pinkeová, Bettina ; Kohút, Jan (referee) ; Kišš, Martin (advisor) The aim of this work is to create a system for classifying historical documents, specifically according to their place of origin. Several systems are proposed for solving this problem. The first system designed and implemented is based on a convolutional neural network with a self-attention mechanism in place of an average pooling layer. Another system is based on the BEiT model, which is built on a vision transformer. The BEiT model was pretrained on the task of masked image modelling and subsequently trained on the given classification task. The system based on the convolutional neural network achieved an accuracy of 81.6%, and the system based on masked image modelling achieved an accuracy of 82.9%. The systems implemented in this work outperformed the systems that participated in the ICDAR 2021 conference.
	Automated contract classification for portal HlidacSmluv.cz Maroušek, Jakub ; Nečaský, Martin (advisor) ; Holub, Martin (referee) The Contracts Register is a public database containing contracts concluded by public institutions. Due to the number of documents in the database, data analysis is problematic. The objective of this thesis is to find a machine learning approach for sorting the contracts into categories by their area of interest (real estate services, construction, etc.) and to implement the approach for use on the web portal Hlídač státu. The large number of categories and the lack of a labeled dataset of contracts complicate the solution.

